Issues in Text-to-Speech Conversion for Mandarin

Authors

Chilin Shih

Richard Sproat

Abstract

Research on text-to-speech (TTS) conversion for Mandarin Chinese is a much younger enterprise than comparable research for English or other European languages. Nonetheless, impressive progress has been made over the last couple of decades, and Mandarin Chinese systems now exist that approach, or in some ways even surpass, the quality of available systems for English. This article has two goals. The first is to summarize the published literature on Mandarin synthesis, with a view to clarifying the similarities or differences among the various efforts. One property shared by a great many systems is the dependence on the syllable as the basic unit of synthesis. We shall argue that this property stems both from the accidental fact that Mandarin has a small number of syllable types, and from traditional Sinological views of the linguistic structure of Chinese. Despite the popularity of the syllable, though, there are problems with using it as the basic synthesis unit, as we shall show. The second goal is to describe in more detail some specific problems in text-to-speech conversion for Mandarin, namely text analysis, concatenative unit selection, segmental duration, and tone and intonation modeling. We illustrate these topics by describing our own work on Mandarin synthesis at Bell Laboratories. The paper starts with an introduction to some basic concepts in speech synthesis, which is intended as an aid to readers who are less familiar with this area of research.


Related resources

Mandarin speech prosody: issues, pitfalls and directions

From the perspective of speech technology development for unlimited Mandarin Chinese TTS, two issues appear to be the greatest impediments: (1) how to predict prosody from text, and (2) how to achieve better naturalness for speech output. These impediments have, to some extent, brought out the major pitfalls in related research, i.e., characteristics of Chinese connected speech and the overall rhythmic structure of sp...


Concatenative Mandarin TTS Accommodating Isolated English Words

An experiment exploring a method for realizing a concatenative Chinese TTS that accommodates isolated English words is presented. The experiment was based on an existing concatenative Mandarin TTS system, developed at the Motorola China Research Center. The experimental system employs an English word synthesizer based on the concatenation of speech segments stored in an English corpus. The original Engl...


RNN-based prosodic modeling for Mandarin speech and its application to speech-to-text conversion

In this paper, a recurrent neural network (RNN) based prosodic modeling method for Mandarin speech-to-text conversion is proposed. The prosodic modeling is performed in the post-processing stage of acoustic decoding and aims at detecting word-boundary cues to assist in linguistic decoding. It employs a simple three-layer RNN to learn the relationship between input prosodic features, extracted f...


L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors

This study was intended first to categorize the L2 learners in terms of their learning style preferences and second to investigate if their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and text related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...


Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants the computer the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...



Journal:

IJCLCLP

Volume 1

Publication date: 1996
